Induction of Modular Classification Rules by Information Entropy Based Rule Generation
Abstract
Prism has been developed since 1987 as a modular classification rule generator that follows the separate-and-conquer approach, in response to the replicated sub-tree problem that occurs in Top-Down Induction of Decision Trees (TDIDT). A series of experiments comparing Prism with TDIDT has shown that Prism generally provides a similar level of accuracy to TDIDT, but with fewer rules and fewer terms per rule. In addition, Prism is generally more tolerant to noise, with consistently better accuracy than TDIDT. However, the authors have identified through experiments that Prism may also produce rule sets that tend to underfit training sets in some cases. This paper introduces a new modular classification rule generator, Information Entropy Based Rule Generation (IEBRG), which also follows the separate-and-conquer approach and is designed to avoid the problems that arise with Prism. The authors review the Prism method, its advantages over TDIDT, and the disadvantages that IEBRG overcomes. They also report an experimental study of the new method's classification accuracy and computational efficiency, and evaluate it comparatively against Prism.
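The abstract does not spell out how IEBRG selects rule terms, so the following is only a minimal sketch of the general idea it names: a separate-and-conquer loop in which each rule term is chosen by information entropy. The function names (`entropy`, `best_term`, `induce_rule`, `induce_rules`), the dictionary-per-row data layout, and the choice of "lowest class entropy among covered rows" as the selection criterion are all assumptions made for illustration, not the authors' algorithm.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_term(rows, target, used):
    """Pick the (attribute, value) pair whose covered rows have the lowest class entropy.

    Assumed selection criterion for this sketch; attributes already used in the
    current rule are skipped.
    """
    best, best_h = None, None
    for attr in rows[0]:
        if attr == target or attr in used:
            continue
        for value in sorted({r[attr] for r in rows}):
            covered = [r[target] for r in rows if r[attr] == value]
            h = entropy(covered)
            if best_h is None or h < best_h:
                best, best_h = (attr, value), h
    return best


def induce_rule(rows, target):
    """Conquer step: add terms until the covered rows share a single class."""
    terms, used, covered = [], set(), rows
    while len({r[target] for r in covered}) > 1:
        term = best_term(covered, target, used)
        if term is None:  # no attributes left; accept an impure rule
            break
        terms.append(term)
        used.add(term[0])
        covered = [r for r in covered if r[term[0]] == term[1]]
    majority = Counter(r[target] for r in covered).most_common(1)[0][0]
    return terms, majority, covered


def induce_rules(rows, target):
    """Separate step: remove the rows each rule covers, then learn the next rule."""
    rules, remaining = [], list(rows)
    while remaining:
        terms, label, covered = induce_rule(remaining, target)
        rules.append((terms, label))
        remaining = [r for r in remaining if r not in covered]
    return rules
```

Calling `induce_rules(rows, "play")` on a list of dictionaries such as `{"outlook": "sunny", "windy": "no", "play": "yes"}` returns a list of `(terms, label)` pairs. In this sketch each rule is learned only from the rows left uncovered by earlier rules, so the result should be read as an ordered list; a rule with no terms acts as a default for whatever remains.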
Similar resources
Entropy Based Fuzzy Rule Weighting for Hierarchical Intrusion Detection
Predicting different behaviors in computer networks is the subject of much data mining research. Providing a balanced Intrusion Detection System (IDS) that directly addresses the trade-off between detecting new attack types and maintaining a low false detection rate is a fundamental challenge. Many of the proposed methods perform well in one of the two aspects, and concentrate on a su...
On Mining Fuzzy Classification Rules for Imbalanced Data
The fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, so that it performs poorly on the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...
Automatic Induction of Classification Rules from Examples Using N-Prism
One of the key technologies of data mining is the automatic induction of rules from examples, particularly the induction of classification rules. Most work in this field has concentrated on the generation of such rules in the intermediate form of decision trees. An alternative approach is to generate modular classification rules directly from the examples. This paper seeks to establish a revise...
A QUADRATIC MARGIN-BASED MODEL FOR WEIGHTING FUZZY CLASSIFICATION RULES INSPIRED BY SUPPORT VECTOR MACHINES
Recently, tuning the weights of the rules in Fuzzy Rule-Base Classification Systems has been researched as a way to improve classification accuracy. In this paper, a margin-based optimization model, inspired by Support Vector Machine classifiers, is proposed to compute these fuzzy rule weights. This approach not only considers both accuracy and generalization criteria in a single objective fu...